CALDER 


National 

Center for Analysis of Longitudinal Data in Education Research 




TRACKING EVERY STUDENT’S LEARNING EVERY YEAR 


Urban Institute 


A program of research by the Urban Institute with Duke University, Stanford University, University of Florida, 
University of Missouri-Columbia, University of Texas at Dallas, and University of Washington 



Feeling the Florida Heat? 

How Low-Performing Schools Respond to Voucher and Accountability Pressure 

Cecilia Elena Rouse, Jane Hannaway, Dan Goldhaber, and David Figlio 



WORKING PAPER 13 • NOVEMBER 2007 






Comments Welcome 



Feeling the Florida Heat? 

How Low-Performing Schools Respond to Voucher and Accountability Pressure 



Cecilia Elena Rouse 
Princeton University and NBER 

Jane Hannaway 
The Urban Institute 

Dan Goldhaber 
University of Washington 

David Figlio 

University of Florida and NBER 



November 28, 2007 

This research could not have been undertaken without the help of many people. We thank 
Edward Freeland, Craig Deshenski, Kenneth Mease, Rob Santos, and Fritz Scheuren for their 
exceptional help with the survey and sample development. We appreciate the assistance of the 
Florida Department of Education in providing us with both administrative student data and 
sampling frames for our survey analysis. Jay Pfeiffer, Jeff Sellers and others at the Florida 
Department of Education provided very helpful advice regarding Florida education policy and 
the administrative data used in the analysis. We also thank Emily Buchsbaum, Cynthia Casazza, 
Sarah Cohodes, Joseph Gasper, Scott Mildrum, Radha Iyendar, Ty Wilde, Grace Wong, and 
Nathan Wozny for expert research assistance and Jesse Rothstein, Analia Schlosser, Diane 
Whitmore Schanzenbach and seminar participants at Harvard University, McMaster University, 
the University of Chicago, the University of Florida, and the fall 2007 research conference of the 
National Center for Analysis of Longitudinal Data in Education Research for extremely useful 
conversations and suggestions. Finally, we are indebted to the Annie E. Casey, Atlantic 
Philanthropies, Smith Richardson and Spencer Foundations, the U.S. Department of Education, 
the National Institutes of Health and the National Center for Analysis of Longitudinal Data in 
Education Research (CALDER is supported by IES Grant R305A060018 to the Urban Institute) 
for financial support, but the views expressed in this paper do not necessarily represent those of the 
organizations supporting this research, the Florida Department of Education, or our host 
institutions. All errors in fact and interpretation are ours. 




Abstract 



While numerous recent authors have studied the effects of school accountability systems on 
student test performance and school “gaming” of accountability incentives, there has been little 
attention paid to substantive changes in instructional policies and practices resulting from school 
accountability. The lack of research is primarily due to the unavailability of appropriate data to 
carry out such an analysis. This paper brings to bear new evidence from a remarkable five-year 
survey conducted of a census of public schools in Florida, coupled with detailed administrative 
data on student performance. We show that schools facing accountability pressure changed their 
instructional practices in meaningful ways. In addition, we present medium-run evidence of the 
effects of school accountability on student test scores, and find that a significant portion of these 
test score gains can likely be attributed to the changes in school policies and practices that we 
uncover in our surveys. 




I. Introduction 

The current national focus on performance-based accountability in K-12 education began 
to develop in the early 1990s in response to dissatisfaction with the performance of U.S. schools 
despite substantial increases in funding (Hanushek, 1994). A conference organized by the 
National Research Council identified the lack of performance incentives in education as the main 
culprit (Hanushek and Jorgenson, 1996). Two views emerged about that time on how to 
introduce performance incentives into education. The first was captured by the work of Chubb 
and Moe (1990) who argued that problems of academic performance result from the regulation 
of schools by public bureaucrats who respond to the interests of organized groups and not to 
interests of students and parents. The solution they proposed was school autonomy, governed by 
market mechanisms. 

The second view, developed by Smith and O’Day (1991), argued that the problem with 
U.S. education was that the system was organized in ways that put it at odds with itself. 
Performance standards, curriculum and student assessment were not purposefully integrated and 
federal and state policies often worked at cross purposes. The solution, they argued, was 
alignment. In the early stages, much of the emphasis of the standards movement, as it was called, 
was on establishing performance standards and aligning the curriculum with those standards. 
With the exception of a few states and some large school districts, developing new assessments 
and holding schools accountable for tested student outcomes proceeded slowly (Hannaway and 
Kimball, 2001). The passage of No Child Left Behind (NCLB) spurred the development of 
test-based accountability systems as the linchpin of standards-based reform. 



Market-based systems attempt to provide more accountability by increasing the 
educational choices available to parents. Most states and school districts now provide increased 
options through open enrollment plans or through the provision of independently-operated 
charter schools. In a handful of places, the market-based system has gone a step further allowing 
parents, through a voucher system, to send their children to private school at public expense. 
Test-based incentives include attempts to increase school accountability through the regular 
testing of students, making the results (aggregated to the school level) public, rewarding 
schools with high or increasing aggregate test scores, and imposing sanctions on poor-performing 
schools. Given that accountability ratings are capitalized into housing values (Figlio and Lucas, 
2004), constituents have both educational and financial incentives to pressure poor-performing 
schools to improve. Both the market solution and test-based solution should provide schools 
with an incentive to operate more efficiently - that is, to improve student outcomes without 
significant increases in resources. 

Previous authors have investigated the short-run effects of both market-based and 
test-based accountability on student outcomes, reaching mixed conclusions. 1 Relatively few have 
investigated the actual behavioral responses of schools to these systems. Some of those that do 
find that schools “game the system” to improve test performance without necessarily increasing 
effort or true productivity. For example, there is some evidence that schools respond to 
accountability pressure by differentially reclassifying low-achieving students as learning 
disabled so that their scores will not count against the school in accountability systems (see, e.g., 
Cullen and Reback (2007), Figlio and Getzler (2007), Jacob (2004)) 2 , and Figlio (2006) indicates 



1 Recent nationwide studies by Carnoy and Loeb (2002) and Hanushek and Raymond (2005) find significant 
improvement in student outcomes as a result of standards-based accountability, whereas the results from some 
specific state systems have been less positive (see, e.g., Koretz and Barron (1998), Clark (2003) and Haney (2000, 
2002)). More recent work by Figlio and Rouse (2006), West and Peterson (2006), and Chakrabarti (2006) finds 
positive short-run effects of accountability on Florida student outcomes, at least in mathematics. The evidence on 
market-based reforms is also mixed (see, e.g., Howell and Peterson (2002), Krueger and Pei (2004), and Rouse 
(1998) for evidence from voucher programs). 

2 Chakrabarti (2006), however, does not find that schools respond in this way. 







that schools differentially suspend students at different points in the testing cycle in an apparent 
attempt to alter the composition of the testing pool. Jacob and Levitt (2003) find that teachers 
are more likely to cheat when faced with more accountability pressure. 

Surprisingly, there has been little systematic effort to determine the substantive ways in 
which schools alter their methods of delivering education in response to school accountability 
and school choice pressures (see Hannaway and Hamilton, 2007, for a review). This is an 
important oversight, as some school policy variables have been found to improve student test 
performance. For example, there is convincing evidence that reducing class size can improve 
achievement (see, e.g., Angrist and Lavy, 1999; Krueger, 1999; and Krueger and Whitmore, 
2001) as can summer school and grade retention (Jacob and Lefgren, 2004). 

Does one observe that schools begin to adopt these or other measures in an attempt to 
improve student outcomes? The answer to this question is of key import for several reasons. 
First, if one observes gains in student outcomes without changes in how schools operate, one 
might be suspicious that the gains are not due to improving efficiency of the schools but perhaps 
to other factors, such as the potentially artificial gains (e.g., changing the tested student 
population) described above. Second, most students are likely to remain in public schools 
regardless of the arrangements designed to provide K-12 education. Thus, the majority of the 
nation’s 50 million students will be affected mainly through the institutional response of public 
schools to these, or other, pressures. 

The primary reason for the lack of research on this topic is the lack of data. While 
information on student test scores and a few crude indicators of student body characteristics 
(e.g., race/ethnicity, eligibility for the National School Lunch Program, and disability rates) are 

3 Schools may also try to boost student achievement on testing days. Figlio and Winicki (2005), for instance, suggest 
that Virginia schools facing accountability pressures altered their school meals on testing days to increase the 
likelihood that students will do well on the exams. 







prevalent, there exist no large-scale data sets that systematically describe school instructional 
policies and practices. In this paper we address this gap by examining how schools respond 
to the pressures introduced by school accountability and school choice systems. 
We do so using an extraordinary data set we have generated. During the 1999-00, 2001-02, and 
2003-04 school years, we administered surveys to every principal of every regular public school 
in the state of Florida. Our surveys, for which we consistently exceeded a 70 percent response 
rate, addressed a range of questions on the delivery of instruction in public schools in Florida. 

At the same time, beginning in summer 1999, Florida’s accountability system - the A+ 
Plan for Education - enlisted stigma (the grading of schools on an “A” to “F” scale), oversight 
(by the state of Florida), and competition to spur school improvement. The competitive pressure 
of the system was derived from the Opportunity Scholarship Program: under this system, 
students attending schools that received an “F” grade in multiple years became eligible for a 
voucher that allowed them to attend a private or higher-rated public school. 4 While in effect for 
nearly 7 years, the Opportunity Scholarship Program was declared unconstitutional by Florida’s 
Supreme Court in January 2006. Other components of the A+ Plan, however, remain in effect. 

We analyze the impact of the accountability system on Florida’s students and schools 
using a three-part analysis. First, we estimate the effect of the accountability system and the 
threat of becoming voucher eligible on student test score performance, both in the short run and 
in the longer term. Second, we study the effects of the reform on school policies and practices. 
Finally, we attempt to determine if the policies appear to affect student achievement or explain 
the change in student performance. We find that student achievement significantly increased in 
elementary schools that received an “F” grade by between 6 and 14 percent of a standard deviation 

4 Currently Florida has two other voucher programs as well: an income tax credit for corporations to fund vouchers 
for low-income students and the McKay Scholarship for students with exceptionalities. In our analysis we control 
for the factors on which eligibility is based for these programs. 







in math and between 6 and 10 percent of a standard deviation in reading in the first year. Three 
years later the impacts persist. 

Importantly, we also detect specific school policy changes implemented by the schools 
that explain part of these increases. Specifically, when faced with increased accountability 
pressure, schools appear to focus on low-performing students, lengthen the amount of time 
devoted to instruction, adopt different ways of organizing the day and learning environment of 
the students and teachers, increase resources available to teachers, and decrease principal control. 
These, combined with other policies, explain more than 15 percent of the test score gains of 
students in reading and over 38 percent of the test score gains of students in math, depending on 
the model specification. As such we find evidence that schools respond to accountability 
pressure in educationally meaningful ways. 

II. The Florida School Accountability Program 

Florida’s 1999 A+ Plan for Education introduced a system of school accountability with a 
series of rewards and sanctions for high-performing and low-performing schools. The A+ Plan 
called for annual curriculum-based testing of all students in grades three through ten, and annual 
grading of all public and charter schools based on aggregate test performance. As noted above, 
the Florida accountability system assigns letter grades (“A,” “B,” etc.) to each school based on 
students’ achievement (measured in several ways). High-performing and improving schools 
receive rewards while low-performing schools receive additional assistance as well as sanctions. 

The assistance provided to low-performing schools primarily consists of three 
components. First, each school district with a school receiving a “D” or “F” is evaluated by a 
“community assessment team” composed of local and state leaders, including parents and 




